Algorithmic Engineering Towards More Efficient Key-Value Systems
Abstract
Distributed key-value systems are widely used as fundamental components of many Internet-scale services at sites such as Amazon, Facebook and Twitter. This thesis examines a system design approach to scaling existing key-value systems, both horizontally and vertically, by carefully engineering and integrating techniques that are grounded in recent theory but also informed by the underlying architectures and the workloads expected in practice. As a case study, we redesign FAWN-KV, a distributed key-value cluster consisting of “wimpy” key-value nodes, to use less memory yet achieve higher throughput even in the worst case. First, to improve the worst-case throughput of a FAWN-KV system, we propose a randomized load balancing scheme that can fully utilize all the nodes regardless of the query distribution. We analytically prove and empirically demonstrate that deploying a very small but extremely fast load balancer at FAWN-KV can effectively prevent uneven or dynamic workloads from creating hotspots on individual nodes. Moreover, our analysis gives service designers a mathematically tractable way to estimate the worst-case throughput and to avoid drastic overprovisioning in similar distributed key-value systems. Second, to implement the high-speed load balancer and to improve the space efficiency of individual key-value nodes, we propose novel data structures and algorithms: the cuckoo filter, a Bloom filter replacement that is fast, highly compact and supports deletion, and optimistic cuckoo hashing, a fast and space-efficient hashing scheme that scales across multiple CPUs. Both are built upon conventional cuckoo hashing but are optimized for our target architectures and workloads.
Using them as building blocks, we design and implement MemC3, which serves transient data from DRAM with high throughput and low-latency retrievals, and SILT, which provides cost-effective access to persistent data on flash storage with an extremely small memory footprint (e.g., 0.7 bytes per entry).
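To make the cuckoo filter idea concrete, the following is a minimal sketch in Python, not the thesis implementation. It assumes the commonly described partial-key scheme: each item is reduced to a short fingerprint, and its two candidate buckets are related by XOR with a hash of that fingerprint. The class name, the 8-bit fingerprint, the bucket size of 4, and the use of SHA-256 are all illustrative assumptions.

```python
import hashlib
import random


class CuckooFilter:
    """Toy cuckoo filter (illustrative only; all parameters are assumptions)."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # num_buckets must be a power of two so that XOR-ing indices stays
        # in range and maps each bucket to its partner and back again.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: str) -> int:
        # 8-bit fingerprint in the range 1..255.
        return self._hash(b"fp:" + item.encode()) % 255 + 1

    def _index(self, item: str) -> int:
        return self._hash(item.encode()) & (self.num_buckets - 1)

    def _alt_index(self, index: int, fp: int) -> int:
        # The partner bucket depends only on the current bucket and the
        # fingerprint, so a stored entry can be relocated without its key.
        return (index ^ self._hash(fp.to_bytes(1, "big"))) & (self.num_buckets - 1)

    def insert(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a random resident and move it to its
        # partner bucket, repeating up to max_kicks times.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Treated as "filter full". In this toy version the last evicted
        # fingerprint is dropped; a real implementation must handle this.
        return False

    def contains(self, item: str) -> bool:
        # May return a false positive, never a false negative for stored items.
        fp = self._fingerprint(item)
        i1 = self._index(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt_index(i1, fp)]

    def delete(self, item: str) -> bool:
        # Only safe for items that were actually inserted.
        fp = self._fingerprint(item)
        i1 = self._index(item)
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

A brief usage example: after `f = CuckooFilter()` and `f.insert("apple")`, `f.contains("apple")` is true, and `f.delete("apple")` removes the entry again. Deletion works because the filter stores fingerprints in specific slots rather than setting shared bits as a Bloom filter does.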
Publication date: 2013